Overview

Dataset Statistics

Number of Variables 6
Number of Rows 103886
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 18.6 MB
Average Row Size in Memory 187.8 B
Variable Types
  • Numerical: 4
  • Categorical: 2
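
The two memory figures above can be cross-checked against each other with simple arithmetic. The sketch below assumes the report's "MB" means 2**20 bytes; the report does not state this, so treat the unit as an assumption.

```python
# Cross-check of the reported memory figures.
# Assumption (not stated in the report): "MB" means 2**20 bytes.
N_ROWS = 103_886          # "Number of Rows"
AVG_ROW_BYTES = 187.8     # "Average Row Size in Memory"


def total_size_mib(n_rows, avg_row_bytes):
    """Total in-memory size implied by the average row size, in MiB."""
    return n_rows * avg_row_bytes / 2**20


implied_total = total_size_mib(N_ROWS, AVG_ROW_BYTES)
print(f"{implied_total:.1f} MiB")
```

Under that assumption the implied total rounds to the reported 18.6.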

Dataset Insights

  • index is uniformly distributed
  • payment_sequential is skewed
  • payment_installments is skewed
  • payment_value is skewed
  • order_id has a high cardinality: 99440 distinct values
  • order_id has constant length 32

Variables


index

numerical

Approximate Distinct Count 103886
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1662176 B
Mean 51942.5
Minimum 0
Maximum 103885
Zeros 1
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • index is uniformly distributed

Quantile Statistics

Minimum 0
5-th Percentile 5194.25
Q1 25971.25
Median 51942.5
Q3 77913.75
95-th Percentile 98690.75
Maximum 103885
Range 103885
IQR 51942.5

Descriptive Statistics

Mean 51942.5
Standard Deviation 29989.4494
Variance 8.9937e+08
Sum 5.3961e+09
Skewness 0
Kurtosis -1.2
Coefficient of Variation 0.5774
  • index is not normally distributed (p-value 0.0009073756540062123)
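
The index statistics above are what closed-form formulas give for a plain row counter. The sketch below assumes the column is exactly 0, 1, ..., N-1, which is consistent with the reported minimum, maximum and distinct count but is not stated in the report. It also assumes the reported standard deviation is the sample (ddof=1) version.

```python
import math

# Assumption: index is exactly 0, 1, ..., N-1.
N = 103_886

mean = (N - 1) / 2                     # mean of 0..N-1
sample_var = N * (N + 1) / 12          # variance of 0..N-1 with ddof=1
sample_std = math.sqrt(sample_var)
cv = sample_std / mean                 # coefficient of variation
# Excess kurtosis of a discrete uniform distribution on N points.
excess_kurtosis = -6 * (N**2 + 1) / (5 * (N**2 - 1))

print(mean, round(sample_std, 4), round(cv, 4), round(excess_kurtosis, 1))
```

A skewness of 0 and an excess kurtosis close to -1.2 are the signature of a uniform distribution, which is why the normality test rejects it.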

order_id

categorical

Approximate Distinct Count 99440
Approximate Unique (%) 95.7%
Missing 0
Missing (%) 0.0%
Memory Size 10076942 B

Length

Mean 32
Standard Deviation 0
Median 32
Minimum 32
Maximum 32

Sample

1st row b81ef226f3fe1789b1...
2nd row a9810da82917af2d9a...
3rd row 25e8ea4e93396b6fa0...
4th row ba78997921bbcdc137...
5th row 42fdf880ba16b47b59...

Letter

Count 1246540
Lowercase Letter 1246540
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 2077812
  • order_id contains many words: 99440 words
  • order_id has words of constant length
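
The character counts above can be reconciled with the constant length of 32. The sketch below uses only figures copied from this report. The remark about hexadecimal identifiers is an inference, not something the report states.

```python
# Values copied from the report above.
N_ROWS = 103_886
ID_LENGTH = 32             # constant length of every order_id
LETTERS = 1_246_540        # "Lowercase Letter"
DIGITS = 2_077_812         # "Decimal Number"

total_chars = N_ROWS * ID_LENGTH
# Letters and digits together should account for every character.
accounted_for = LETTERS + DIGITS
letter_share = LETTERS / total_chars

print(total_chars, accounted_for, f"{letter_share:.1%}")
```

Letters and digits account for every character, and the letter share is close to 6/16, which would be expected if the identifiers are hexadecimal strings (digits 0-9 plus letters a-f).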

payment_sequential

numerical

Approximate Distinct Count 29
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1662176 B
Mean 1.0927
Minimum 1
Maximum 29
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • payment_sequential is skewed right (γ1 = 16.1798)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 1
Median 1
Q3 1
95-th Percentile 1
Maximum 29
Range 28
IQR 0

Descriptive Statistics

Mean 1.0927
Standard Deviation 0.7066
Variance 0.4993
Sum 113514
Skewness 16.1798
Kurtosis 370.5759
Coefficient of Variation 0.6467
  • payment_sequential is not normally distributed (p-value 4.454036188143844e-25)
  • payment_sequential has 4526 outliers
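
The descriptive statistics for payment_sequential are related by simple identities (mean = sum / rows, variance = standard deviation squared, coefficient of variation = standard deviation / mean). A minimal sketch that recomputes them from the reported figures, with small differences expected because the inputs are already rounded:

```python
# Values copied from the report above.
N_ROWS = 103_886
TOTAL = 113_514        # "Sum"
STD = 0.7066           # "Standard Deviation"

mean = TOTAL / N_ROWS  # should match the reported Mean
variance = STD ** 2    # should match the reported Variance
cv = STD / mean        # should match the reported Coefficient of Variation

print(round(mean, 4), round(variance, 4), round(cv, 4))
```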

payment_type

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 7771787 B
  • The largest value (credit_card) is over 3.88 times larger than the second largest value (boleto)

Length

Mean 9.8107
Standard Deviation 2.0617
Median 11
Minimum 6
Maximum 11

Sample

1st row credit_card
2nd row credit_card
3rd row credit_card
4th row credit_card
5th row credit_card

Letter

Count 940870
Lowercase Letter 940870
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (credit_card, boleto) take over 50.0%

payment_installments

numerical

Approximate Distinct Count 24
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1662176 B
Mean 2.8533
Minimum 0
Maximum 24
Zeros 2
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • payment_installments is skewed right (γ1 = 1.655)

Quantile Statistics

Minimum 0
5-th Percentile 1
Q1 1
Median 1
Q3 4
95-th Percentile 10
Maximum 24
Range 24
IQR 3

Descriptive Statistics

Mean 2.8533
Standard Deviation 2.6871
Variance 7.2202
Sum 296423
Skewness 1.655
Kurtosis 2.5499
Coefficient of Variation 0.9417
  • payment_installments is not normally distributed (p-value 2.42003623097908e-22)
  • payment_installments has 6313 outliers

payment_value

numerical

Approximate Distinct Count 29077
Approximate Unique (%) 28.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 1662176 B
Mean 154.1004
Minimum 0
Maximum 13664.08
Zeros 9
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • payment_value is skewed right (γ1 = 9.2539)

Quantile Statistics

Minimum 0
5-th Percentile 26.1125
Q1 56.79
Median 100
Q3 171.8375
95-th Percentile 437.635
Maximum 13664.08
Range 13664.08
IQR 115.0475

Descriptive Statistics

Mean 154.1004
Standard Deviation 217.4941
Variance 47303.6678
Sum 1.6009e+07
Skewness 9.2539
Kurtosis 241.8167
Coefficient of Variation 1.4114
  • payment_value is not normally distributed (p-value 6.497825718489418e-25)
  • payment_value has 7981 outliers
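
The report does not say which rule produces its outlier counts. A common choice is Tukey's fences, which flag values more than 1.5 × IQR below Q1 or above Q3. The sketch below applies that rule to the quartiles reported above. The rule, the helper name `tukey_fences` and the multiplier `k` are assumptions for illustration, not the tool's documented method.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (low, high) fences; values outside them count as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# (Q1, Q3) pairs copied from the Quantile Statistics sections above.
QUARTILES = {
    "payment_sequential": (1, 1),
    "payment_installments": (1, 4),
    "payment_value": (56.79, 171.8375),
}

fences = {name: tukey_fences(q1, q3) for name, (q1, q3) in QUARTILES.items()}
for name, (low, high) in fences.items():
    print(f"{name}: outlier if value < {low:g} or value > {high:g}")
```

Under this assumption, payment_sequential has an IQR of 0, so both fences collapse to 1 and every payment with a sequence number above 1 counts as an outlier. That is how a column whose median is 1 can still report thousands of outliers.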
